Instance Selection to Improve Gamma Classifier

Authors

  • Jarvin A. Antón Vargas
  • Yenny Villuendas-Rey
  • Itzamá López-Yáñez
Abstract

Pre-processing the dataset is an important stage in the Knowledge Discovery in Datasets (KDD) process, and filtering noise through instance selection is a necessary part of it, because it reduces the risk of using misclassified and non-representative instances to train supervised classifiers. This study aims to improve the performance of the Gamma associative classifier by introducing a novel similarity function to guide instance selection. The experiments, run over 15 datasets, cover several instance selection methods and analyze their influence on the performance of the Gamma classifier. The proposed similarity function is evaluated by classifier accuracy and instance retention ratio, with good results on both.
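The abstract does not define the proposed similarity function, so the following is only a minimal sketch of the general idea of noise filtering by instance selection, together with the retention ratio it reports. It assumes Wilson-style edited nearest neighbor filtering with Euclidean distance as a stand-in; the function names `edited_nearest_neighbor` and `retention_ratio` and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def edited_nearest_neighbor(X, y, k=3):
    """Return indices of instances whose label agrees with the majority
    label of their k nearest neighbors (a generic noise filter, not the
    paper's similarity-guided method)."""
    kept = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Euclidean distance to every other instance (an assumption here).
        neighbors = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )[:k]
        votes = Counter(y[j] for _, j in neighbors)
        majority_label = votes.most_common(1)[0][0]
        if majority_label == yi:
            kept.append(i)
    return kept


def retention_ratio(kept, total):
    """Fraction of the original training set that survives selection."""
    return len(kept) / total


# Toy data: two tight clusters plus one point labeled "b" inside cluster "a".
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
     (0.05, 0.05)]
y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

kept = edited_nearest_neighbor(X, y, k=3)
print(kept, retention_ratio(kept, len(X)))
```

In this toy set the last instance sits inside the "a" cluster but carries label "b", so the filter discards it and keeps the other eight. A classifier would then be trained only on the kept indices.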

Similar resources

Instance Selection in the Performance of Gamma Associative Classifier

The Gamma associative classifier is among the most used classifiers of the alpha-beta associative approach. It has been used successfully to solve many Pattern Recognition tasks, including environmental applications. However, like most classifiers, Gamma suffers from the presence of noisy or mislabeled instances in the training sets. This paper evaluates the impact of using instance selection tec...

Evaluation of Classifiers in Software Fault-Proneness Prediction

The reliability of software depends on its fault-prone modules: the fewer fault-prone units a piece of software contains, the more we can trust it. Therefore, if we are able to predict the number of fault-prone modules in a piece of software, it becomes possible to judge its reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...

Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets

Objective(s): This study addresses feature selection for breast cancer diagnosis. The process uses a wrapper approach based on GA feature selection and a PS-classifier. The experimental results show that the proposed model is comparable to the other models on Wisconsin breast cancer datasets. Materials and Methods: To evaluate the effectiveness of the proposed feature selection method, we ...

Improving Cascade Classifier Precision by Instance Selection and Outlier Generation

Besides the curse of dimensionality and imbalanced classes, unfavorable data distributions can hamper classification accuracy. This is particularly problematic with increasing dimensionality of the classification task. A classifier that can handle high-dimensional and imbalanced data sets is the cascade classification method for time series. The cascade classifier can compound unfavorable data d...

MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention in recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy for multi-label learning by leveraging label-specific ...

Journal title:
  • Polibits

Volume 54

Publication year 2016